A Data-driven Adaptation of Prosody in a Multilingual TTS

Authors

  • Janez Stergar
  • Çaglayan Erdem
  • Bogomir Horvat
  • Zdravko Kacic
Abstract

Proper accentuation and phrasing make the syntactic and semantic structure of a message more transparent to the listener. Good prosody modeling in a TTS system therefore has to be structured into appropriate levels. The implemented prosodic hierarchy should guide the listener's attention and support the comprehension process. Since inappropriate prosody functions as a distractor, the prosody module of a TTS system has to be built very carefully. With the goal of improving naturalness, we introduce a selective hierarchical approach to prominence disambiguation and symbolic modeling. We discuss the selective, statistically based concept for prominence disambiguation and prediction, and describe the integration of a neural network (NN) module for predicting symbolic tags into a multilingual TTS system. We conclude with prediction results and a suitability test of the selective approach, based on preliminary acoustic tests performed in a multilingual TTS system.

Related resources

An Adaptable Acoustic Architecture in a Multilingual TTS System

In this paper an adaptable acoustical architecture in a multilingual TTS system is presented. The whole architecture is designed to be a data-driven system. Modules comprising text preprocessing, grapheme-to-phoneme conversion, lexical stress detection, OOV-handling, symbolic prosody prediction, acoustic prosody prediction and unit selection with concatenation use machine learning techniques es...

Learning the parameters of quantitative prosody models

The article introduces a novel hybrid data driven and rule based approach for the prosody control in a TTS system, which combines the advantages of well-balanced, quantitative models with the flexible training of derived model parameters. Instancing the training of Fujisaki intonation parameters for German (MFGI) the article describes the hybrid data driven and rule based architecture HYDRA, th...

A Novel Prosody Adaptation Method for Mandarin Concatenation-Based Text-to-speech System

The paper presents a prosody adaptation method which is able to adapt the prosody model of text to speech (TTS) to a new style with a small training corpus. Unlike the conventional prosody mapping between two parallel prosody features, the paper tries to integrate the prosody conversion into the prosody generation model of TTS. In the paper, we use a template based prosody model which consists ...

A metrical model of prosody for French TTS

The model of prosody used for French TTS in the Aculab TTS system is unusual in several respects. Firstly, it is based firmly on current metrical theories of French prosody. Secondly, it is entirely knowledge-based: there are no stochastic components in the model. Thirdly, it makes use of a pseudo-random element to avoid the predictability of synthetic prosody. Fourthly, it is designed to facil...

Design of Multilingual Speech Synthesis System

The main objective of this paper is to convert written multilingual text into machine-generated synthetic speech. The paper proposes a complete multilingual speech synthesizer for three languages: Indian English, Tamil and Telugu. The main application of the TTS system is that it will be helpful for blind and mute people that they could have the text read to them by compu...

Publication date: 2004